An Effort to Develop a Tagged Lexical Resource for Sanskrit
Abstract
In this paper we present our efforts, the first of their kind in the history of Sanskrit, to design and develop a structured electronic lexical resource by tagging a traditional Sanskrit dictionary. We describe how the entire unstructured raw text of Vaacaspatyam, an encyclopedic Sanskrit dictionary, has been tagged through corpus design methods to form a user-friendly e-lexicon with structured and segregated information.
Similar resources
An Approach for Grammatical Constructs of Sanskrit Language using Morpheme and Parts-of-Speech Tagging by Sanskrit Corpus
For thousands of years, Sanskrit has been the oriental language of India. It is the base for most of the Indian languages. Statistical processing of natural language is based on corpora (singular: corpus). A language corpus is a collection of written and spoken texts, gathered in an organized way in electronic media for the purpose of linguistic research. It pr...
Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets
India is a country with 22 officially recognized languages, and 17 of these have WordNets, a crucial resource. Web browser based interfaces are available for these WordNets, but they are not suited to mobile devices, which deters people from using this resource effectively. We present our initial work on developing mobile applications and browser extensions to access WordNets for Indian Languages. Ou...
Statistical Evaluation of Pronunciation Encoding
In this study we investigate the idea to automatically evaluate newly created pronunciation encodings for being correct or containing a potential error. Using a cascaded triphone detector and phonotactical n-gram modeling with an optimal Bayesian threshold we classify unknown pronunciation transcripts into the classes ’probably faulty’ or ’probably correct’. Transcripts tagged ’probably faulty’...
A Collaborative Platform for Sanskrit Processing
Sanskrit, the classical language of India, presents specific challenges for computational linguistics: exact phonetic transcription in writing that obscures word boundaries, rich morphology and an enormous corpus, among others. Recent international cooperation has developed innovative solutions to these problems and significant resources for linguistic research. Solutions include efficient segm...
A Distributed Platform for Sanskrit Processing
Sanskrit, the classical language of India, presents specific challenges for computational linguistics: exact phonetic transcription in writing that obscures word boundaries, rich morphology and an enormous corpus, among others. Recent international cooperation has developed innovative solutions to these problems and significant resources for linguistic research. Solutions include efficient segm...
Publication date: 2008